The biggest difference between Python 2 and Python 3 is in string handling.
In Python 3 all strings are Unicode strings by default. By default, the Python interpreter also expects source files to be encoded in UTF-8.
Explaining what Unicode is lies beyond the scope of this course.
If you don't know what Unicode or encodings are, do not despair. You will find out when you need to, and plenty of people have lived their entire lives happily without knowing what encodings are.
Suffice it to say that it is safe to use Unicode characters in strings and in variable names in Python 3.
In [ ]:
ananasakäämä = "höhö 电脑"
print(ananasakäämä)
In [ ]:
print("\N{GREEK CAPITAL LETTER DELTA}") # using the character name
print("\u0394") # using a 16-bit hex value
print("\U00000394") # using a 32-bit hex value
If you have a bytes object, you can call the decode() method on it and give an encoding as an argument. Conversely, you can encode() a string.
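As a minimal sketch of that round trip, the cell below encodes a string to bytes and decodes it back. The sample text and the variable names are only illustrative, and UTF-8 is assumed as the encoding.

```python
# Hypothetical sample text; any str would do.
text = "höhö 电脑"

# str -> bytes: encode() turns the string into raw bytes in the given encoding.
data = text.encode("utf-8")
print(type(data))
print(len(text), "characters become", len(data), "bytes")

# bytes -> str: decode() must be told which encoding the bytes are in.
restored = data.decode("utf-8")
print(restored == text)
```

Note that the number of bytes is usually larger than the number of characters, because non-ASCII characters take more than one byte in UTF-8.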
Both single '' and double "" quotes denote a string. They are equally valid and it is a question of preference which to use. It is recommended to be consistent within the same file, though.
It is permissible to use single quotes inside a double-quoted string or double quotes inside a single-quoted string.
If you want to have the same kind of quotes inside a string, you must escape them with a backslash \. Because the backslash is the escape character, a literal backslash must itself be entered as a double backslash \\.
In [ ]:
permissible = "la'l'a'a"
print(permissible)
permissible = 'la"l"a"a'
print(permissible)
permissible = "\"i am a quote \\ \""
print(permissible)
There are several ways to create multiline strings:
In [ ]:
permissible = """
i am a multi
line
string
"""
print(permissible)
In [ ]:
permissible = ("i"
" am" #note the whitespace before the word inside the string
' a'
" multiline"
' string')
print(permissible)
First, it is essential to remember that strings are immutable: whatever you do with a string, it will not change. Most methods on strings return a new, modified string or some other object.
If you have any programming experience, many of the following examples will seem familiar to you.
A complete list of string methods can, as always, be found in the documentation.
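To make immutability concrete, here is a small sketch. The variable names are made up for the illustration.

```python
greeting = "hello"

# Methods return a new string; the original is left untouched.
shouted = greeting.upper()
print(greeting)
print(shouted)

# In-place modification is not allowed and raises a TypeError.
try:
    greeting[0] = "J"
except TypeError as error:
    print("cannot modify:", error)

# To "change" a string, build a new one and rebind the name.
greeting = "J" + greeting[1:]
print(greeting)
```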
In [ ]:
example = "The quick brown fox jumps over the lazy dog "
In [ ]:
# the split method splits at whitespace by default
example.split()
It can be given any separator string as a parameter. The return value is a list, so it can be indexed with [].
In [ ]:
example.split("e")[0]
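One detail worth seeing is that split() without arguments and split(" ") are not the same thing. The sketch below repeats the example string, including its trailing space, so that the cell runs on its own.

```python
example = "The quick brown fox jumps over the lazy dog "

# No argument: runs of whitespace count as one separator,
# and leading/trailing whitespace is ignored.
words = example.split()

# Explicit separator: every single space splits,
# so the trailing space leaves an empty string at the end.
pieces = example.split(" ")

print(len(words), words[-1:])
print(len(pieces), pieces[-1:])
```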
Strings can be indexed and sliced using the same notation as lists.
In [ ]:
example[5:10]
The strip() method removes all leading and trailing characters that belong to a given set of characters, defaulting to whitespace. Characters in the middle of the string are left alone. This is surprisingly often needed.
In [ ]:
example.strip()
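The following sketch shows strip() with and without an argument, along with its one-sided variants lstrip() and rstrip(). The sample string is invented for the illustration.

```python
padded = "  --hello--  "

# Default: remove whitespace from both ends.
no_spaces = padded.strip()

# The argument is a set of characters, not a prefix or suffix:
# every leading/trailing space or dash is removed.
bare = padded.strip(" -")

# lstrip() and rstrip() work on one end only.
left_only = padded.lstrip()
right_only = padded.rstrip()

for result in (no_spaces, bare, left_only, right_only):
    print(repr(result))
```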
Strings can be converted to lower() or upper() case.
In [ ]:
example.upper()
Searching for a substring is done with the find() method. It returns the index of the first match, or -1 if the substring is not found.
In [ ]:
example.find("ick")
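A short sketch of what find() gives back in the found and not-found cases, next to two related tools: the in operator and the index() method. The example string is repeated so the cell is self-contained; "cat" is just a substring that does not occur in it.

```python
example = "The quick brown fox jumps over the lazy dog "

position = example.find("ick")   # index of the first match
missing = example.find("cat")    # -1 signals "not found"
print(position, missing)

# If you only need a yes/no answer, the in operator reads better.
has_fox = "fox" in example
print(has_fox)

# index() behaves like find() but raises ValueError instead of returning -1.
try:
    example.index("cat")
except ValueError:
    print("index() raised ValueError")
```

Be careful with the -1 return value: used as an index it silently refers to the last character, so check for it explicitly.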
Sometimes it's important to know whether a string consists only of digits or other numeric characters.
In [ ]:
"124".isdigit()
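These checks are stricter than they might first appear: a sign or a decimal point makes them fail, and isdecimal(), isdigit() and isnumeric() disagree on some Unicode characters. The sketch below compares the three on a few hand-picked candidate strings.

```python
candidates = ["124", "-124", "1.5", "²", "½", ""]

# Map each candidate to (isdecimal, isdigit, isnumeric).
checks = {
    candidate: (candidate.isdecimal(), candidate.isdigit(), candidate.isnumeric())
    for candidate in candidates
}

for candidate, result in checks.items():
    print(repr(candidate), result)
```

If the real question is "can this be converted to a number?", calling int() or float() inside a try/except block is usually the more robust test.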